Abstract: Weather data analytics matters in nearly every aspect of human life. Weather plays a crucial role in many sectors, including agriculture, tourism, government planning, and industry. It is described by parameters such as temperature, pressure, humidity, and wind speed. National meteorological departments have deployed sensors for each weather parameter at various geographical locations. Data from these sensors is collected daily and stored mostly in unstructured formats. As a result, a huge amount of data has been collected and archived, and storing and processing it for accurate weather prediction is a major challenge. Big data technologies such as Hadoop and Spark have evolved to address these challenges using distributed computing. To date, only a few studies have reported on processing weather data with MapReduce. Spark, an emerging technology built around in-memory computing, can likewise be applied to weather data analytics. This project analyzes weather data by calculating the minimum, maximum, and average values of weather parameters. The code is implemented in both MapReduce and Spark to compare their relative performance for weather data analytics.
Keywords: Big Data, Hadoop, Spark, MapReduce, Weather Data Analytics.
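As an illustration of the computation described in the abstract, the following is a minimal sketch in plain Python, not the project's MapReduce or Spark code. It assumes the readings have already been parsed into hypothetical (parameter, value) pairs; the record layout and the names `to_stats`, `merge`, and `summarize` are assumptions made for this example. Each reading is mapped to a (min, max, sum, count) summary, and summaries are merged pairwise, so the average can be derived from partial results computed on different nodes.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical, already-parsed records: (parameter, reading).
# Real sensor archives would need a parsing step first.
records = [
    ("temperature", 21.5),
    ("temperature", 25.0),
    ("humidity", 60.0),
    ("temperature", 18.5),
    ("humidity", 70.0),
]


def to_stats(value):
    """Map step: turn one reading into a (min, max, sum, count) summary."""
    return (value, value, value, 1)


def merge(a, b):
    """Reduce step: combine two summaries.

    The operation is associative and commutative, so partial
    summaries can be merged in any order.
    """
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def summarize(records):
    """Group summaries by parameter, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(to_stats(value))

    result = {}
    for key, stats in grouped.items():
        lo, hi, total, count = reduce(merge, stats)
        result[key] = {"min": lo, "max": hi, "avg": total / count}
    return result


summary = summarize(records)
for parameter, stats in sorted(summary.items()):
    print(parameter, stats)
```

Carrying the sum and count instead of a running average is the key design choice: averages of partial averages are not correct in general, whereas sums and counts combine exactly. A function shaped like `merge` is the kind of operation one could use as a MapReduce combiner and reducer, or pass to Spark's `reduceByKey`.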